By Lance Martin
TL;DR: Agents need context to perform tasks. Context engineering is the art and science of filling the context window with just the right information at each step.
LLMs are like a new kind of operating system: the LLM is the CPU, and its context window is the RAM, serving as the model's working memory.
Context engineering is the "...delicate art and science of filling the context window with just the right information for the next step." - Andrej Karpathy
Agents interleave LLM invocations and tool calls for long-running tasks.
"Context engineering is effectively the #1 job of engineers building AI agents." - Cognition
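This interleaving of model invocations and tool calls can be sketched as a simple loop. `call_llm` and the `search` tool below are hypothetical stubs for illustration, not any particular provider's API:

```python
def call_llm(messages):
    # Stub standing in for a real chat-completion API call.
    # Here we pretend the model asks for one tool call, then finishes.
    if any(m["role"] == "tool" for m in messages):
        return {"type": "final", "content": "done"}
    return {"type": "tool_call", "name": "search",
            "args": {"query": "context engineering"}}

# Illustrative tool registry: name -> callable.
TOOLS = {"search": lambda query: f"results for {query!r}"}

def run_agent(task, max_turns=5):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        action = call_llm(messages)
        if action["type"] == "final":
            return action["content"], messages
        result = TOOLS[action["name"]](**action["args"])
        # Each tool result is appended to the context for the next step.
        messages.append({"role": "tool", "content": result})
    return None, messages
```

Because every tool result is appended to the message list, the context window grows with each step; the strategies that follow all manage that growth.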
Writing context means saving it outside the context window to help an agent perform a task. Agents take notes and save information externally, much as humans do when working through a problem.
Example: Anthropic's multi-agent researcher saves its plan to memory so the plan survives context-window truncation.
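A minimal scratchpad might be a small file-backed key-value store. The path and structure below are illustrative, not Anthropic's implementation:

```python
import json
import os

class Scratchpad:
    """Persist agent notes outside the context window, keyed by name."""

    def __init__(self, path):
        self.path = path

    def write(self, key, value):
        data = self._load()
        data[key] = value
        with open(self.path, "w") as f:
            json.dump(data, f)

    def read(self, key, default=None):
        return self._load().get(key, default)

    def _load(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)
```

Because the plan lives on disk rather than in the message history, it can be re-read even after the context window is compacted.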
Agents remember things across many sessions.
Examples: Reflexion, Generative Agents, ChatGPT, Cursor, Windsurf.
Selecting context means pulling it into the context window to help an agent perform a task.
Agents read from scratchpads via tool calls, or the runtime can expose scratchpad state to the LLM directly.
Agents select relevant memories, whether episodic (past experiences), procedural (instructions for behavior), or semantic (facts).
Challenges: ensuring the selected memories are relevant, and avoiding retrieval the user does not want or expect.
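Memory selection can be sketched as scoring stored memories against the current task and keeping only high-scoring ones. Production systems score with embeddings; plain token overlap is used here only to keep the sketch self-contained:

```python
def score(query, memory):
    # Fraction of query tokens that also appear in the memory.
    q, m = set(query.lower().split()), set(memory.lower().split())
    return len(q & m) / (len(q) or 1)

def select_memories(query, memories, k=2, min_score=0.1):
    # The threshold guards against pulling in irrelevant memories.
    ranked = sorted(memories, key=lambda m: score(query, m), reverse=True)
    return [m for m in ranked if score(query, m) >= min_score][:k]
```

The `min_score` cutoff addresses the relevance challenge above: with no threshold, the top-k results are returned even when nothing in memory actually matches the task.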
RAG applied to tool descriptions improves tool selection accuracy.
RAG is central to context engineering, especially in code agents.
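A sketch of RAG over tool descriptions: rather than placing every tool schema in the prompt, retrieve only the tools whose descriptions match the task. The tool names and descriptions below are invented for illustration, and token overlap again stands in for embedding search:

```python
# Hypothetical tool registry: name -> natural-language description.
TOOL_DESCRIPTIONS = {
    "web_search": "search the web for up to date information",
    "run_sql": "execute a sql query against the analytics database",
    "send_email": "send an email message to a recipient",
}

def retrieve_tools(task, k=1):
    """Return the k tool names whose descriptions best match the task."""
    task_tokens = set(task.lower().split())

    def overlap(name):
        return len(task_tokens & set(TOOL_DESCRIPTIONS[name].split()))

    ranked = sorted(TOOL_DESCRIPTIONS, key=overlap, reverse=True)
    return ranked[:k]
```

Only the retrieved tools' schemas are then passed to the model, which shrinks the prompt and reduces confusion between similarly named tools.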
Compressing context means retaining only the tokens required to perform a task.
Summarize agent interactions to manage token use; Claude Code's auto-compact, for example, summarizes the full message history as the context window fills.
Filter or prune context with heuristics or trained models. Examples: dropping older messages via a simple heuristic, or using a trained context pruner such as Provence for question answering.
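Summarization-based compaction can be sketched as follows. `summarize` is a stand-in for an LLM summarization call, stubbed here so the example runs:

```python
def summarize(messages):
    # Stand-in for an LLM call that condenses older turns into a summary.
    return "summary of %d earlier messages" % len(messages)

def compact(messages, keep_recent=4):
    """Replace older messages with a summary, keeping recent turns verbatim."""
    if len(messages) <= keep_recent:
        return messages
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [{"role": "system", "content": summarize(older)}] + recent
```

The trade-off is lossiness: anything the summary omits is gone, which is why agents often pair compaction with an external scratchpad for details that must survive exactly.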
Isolating context means splitting it up to help an agent perform a task.
Split context across sub-agents for separation of concerns; each sub-agent works in its own context window.
Use sandboxes (e.g., CodeAgent) to hold token-heavy objects in the runtime environment, isolating them from the LLM.
Runtime state objects isolate context through schema design: expose only the fields the LLM needs at each turn, and keep heavy artifacts in state.
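A minimal version of this pattern is a state object whose schema marks which fields may enter the prompt. The field names below are illustrative, not any specific framework's schema:

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    task: str
    plan: str = ""
    # Token-heavy artifacts live in state but never enter the prompt.
    raw_documents: list = field(default_factory=list)

    # Plain class attribute (not a dataclass field): the prompt allow-list.
    EXPOSED = ("task", "plan")

    def to_prompt(self):
        """Return only the whitelisted fields for the LLM's context."""
        return {name: getattr(self, name) for name in self.EXPOSED}
```

The agent's tools read and write the full state, while `to_prompt` controls the much smaller slice the model actually sees each turn.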
Writing, selecting, compressing, and isolating context are the core context engineering strategies; understanding these patterns is key to building effective agents.